FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning
Federated learning (FL) has emerged as a new paradigm for privacy-preserving
computation in recent years. Unfortunately, FL faces two critical challenges
that hinder its practical performance: data distribution heterogeneity and the
high resource costs brought by large foundation models. Specifically, the
non-IID data across clients makes it hard for existing FL algorithms to
converge, while the high resource costs, including computational and
communication costs, increase the difficulty of deploying FL in real-world
scenarios. In this paper, we
propose an effective yet simple method, named FedCLIP, to achieve fast
generalization and personalization for CLIP in federated learning. Concretely,
we design an attention-based adapter for the large model, CLIP, and all
remaining operations depend only on the adapter. Lightweight adapters make the
most of the pretrained model's information and keep models adaptive to each
client's specific task. At the same time, these small-scale operations mitigate
the computational and communication burdens caused by large models. Extensive
experiments are conducted on three datasets with distribution shifts.
Qualitative and quantitative results demonstrate that FedCLIP significantly
outperforms other baselines (9% overall improvements on PACS) and effectively
reduces computational and communication costs (283x faster than FedAVG). Our
code will be available at: https://github.com/microsoft/PersonalizedFL.
Comment: Accepted by IEEE Data Engineering Bulletin.
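The adapter idea described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction based only on the abstract, not the authors' implementation: the `AttentionAdapter` class, its shapes, and the softmax-gating scheme are assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class AttentionAdapter:
    """Hypothetical small trainable module on top of frozen CLIP features.

    In a FedCLIP-style setup, only the adapter weights (W1, W2 here) would
    be trained and exchanged between clients, while the CLIP backbone stays
    frozen -- which is what keeps computation and communication cheap.
    """
    def __init__(self, dim, bottleneck, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.W2 = rng.normal(0.0, 0.02, size=(bottleneck, dim))

    def __call__(self, feats):
        # Derive per-dimension attention weights from the features, then
        # reweight the frozen features (a simple multiplicative gate).
        attn = softmax(np.maximum(feats @ self.W1, 0.0) @ self.W2)
        return feats * attn

feats = np.ones((2, 8))                  # stand-in for frozen CLIP features
adapter = AttentionAdapter(dim=8, bottleneck=4)
out = adapter(feats)
print(out.shape)                         # (2, 8)
```

Because the softmax weights sum to one per example, the adapter redistributes feature mass rather than rescaling it arbitrarily; that design choice is one plausible reading of "attention-based adapter", not a claim about the paper's exact architecture.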
Frustratingly Easy Model Generalization by Dummy Risk Minimization
Empirical risk minimization (ERM) is a fundamental machine learning paradigm.
However, its generalization ability is limited in various tasks. In this paper,
we devise Dummy Risk Minimization (DuRM), a frustratingly easy and general
technique to improve the generalization of ERM. DuRM is extremely simple to
implement: just enlarging the dimension of the output logits and then
optimizing using standard gradient descent. We validate the efficacy of DuRM
through both theoretical and empirical analysis. Theoretically, we show that
DuRM induces greater gradient variance, which facilitates generalization by
helping models find flatter local minima. Empirically, we conduct
evaluations of DuRM across different datasets, modalities, and network
architectures on diverse tasks, including conventional classification, semantic
segmentation, out-of-distribution generalization, adversarial training, and
long-tailed recognition. Results demonstrate that DuRM consistently improves
performance across all tasks in an almost-free-lunch manner.
Furthermore, we show that DuRM is compatible with existing generalization
techniques and we discuss possible limitations. We hope that DuRM could trigger
new interest in fundamental research on risk minimization.
Comment: Technical report; 22 pages.
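The recipe in the abstract, enlarging the output logits with classes that never receive labels and then training as usual, can be sketched as below. This is a minimal illustration of the stated idea, not the authors' code; the class counts and the gradient helper are assumptions for demonstration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_grad(logits, labels):
    # Gradient of mean softmax cross-entropy w.r.t. logits:
    # softmax(z) - one_hot(y), averaged over the batch.
    p = softmax(logits)
    p[np.arange(len(labels)), labels] -= 1.0
    return p / len(labels)

num_real, num_dummy = 3, 2       # 3 real classes + 2 dummy classes
rng = np.random.default_rng(0)

# The only change DuRM-style training needs: the logit head is widened
# from num_real to num_real + num_dummy outputs.
logits = rng.normal(size=(4, num_real + num_dummy))
labels = np.array([0, 2, 1, 0])  # labels only ever index real classes

grad = cross_entropy_grad(logits, labels)
print(grad.shape)                # (4, 5)
```

Note that the dummy logits still receive gradient (the softmax couples all outputs), which is consistent with the abstract's claim that the extra dimensions change the gradient's behavior even though they are never correct labels.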
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
ChatGPT is a recent chatbot service released by OpenAI that has been receiving
increasing attention over the past few months. While various aspects of ChatGPT
have been evaluated, its robustness, i.e., its performance on unexpected
inputs, is still unclear to the public. Robustness is of particular
concern in responsible AI, especially for safety-critical applications. In this
paper, we conduct a thorough evaluation of the robustness of ChatGPT from the
adversarial and out-of-distribution (OOD) perspective. To do so, we employ the
AdvGLUE and ANLI benchmarks to assess adversarial robustness and the Flipkart
review and DDXPlus medical diagnosis datasets for OOD evaluation. We select
several popular foundation models as baselines. Results show that ChatGPT has
consistent advantages on most adversarial and OOD classification and
translation tasks. However, its absolute performance is far from perfect,
which suggests that adversarial and OOD robustness remains a significant threat
to foundation models. Moreover, ChatGPT shows astounding performance in
understanding dialogue-related texts and we find that it tends to provide
informal suggestions for medical tasks instead of definitive answers. Finally,
we present in-depth discussions of possible research directions.
Comment: Technical report; code is at: https://github.com/microsoft/robustlear
Chinese expert consensus on the diagnosis and treatment of malignant pleural mesothelioma
Malignant pleural mesothelioma (MPM) is a malignant tumor originating from the pleura, and its incidence has been increasing in recent years. Due to the insidious onset and strong local invasiveness of MPM, most patients are diagnosed at a late stage, so early screening and treatment for high-risk populations are crucial. The treatment of MPM mainly includes surgery, chemotherapy, and radiotherapy. Immunotherapy and electric field therapy have also been applied, leading to further improvements in patient survival. The Mesothelioma Group of the Yangtze River Delta Lung Cancer Cooperation Group (East China LUng caNcer Group, ECLUNG; Youth Committee) developed a national consensus on the clinical diagnosis and treatment of MPM based on existing clinical research evidence and the opinions of national experts. This consensus aims to promote the homogenization and standardization of MPM diagnosis and treatment in China, covering epidemiology, diagnosis, treatment, and follow-up.